
    Modelling Customer Behaviour with Topic Models for Retail Analytics

    Topic modelling is a scalable statistical framework that can model high-dimensional grouped data while retaining explanatory power. In the domain of grocery retail analytics, topic models have not been thoroughly explored. In this thesis, I show that topic models are powerful techniques for identifying customer behaviours and summarising customer transactional data, and that they offer substantial commercial value. This thesis has two objectives. First, to identify grocery shopping patterns that describe British food consumption, taking into account regional diversity and temporal variability. Second, to provide new methodologies that address the challenges of training topic models with grocery transactional data. These objectives are fulfilled across three research parts. In the first part, I introduce a framework to evaluate and summarise topic models. I propose to evaluate topic models on four aspects: generalisation, interpretability, distinctiveness and credibility. In this manner, topic models should represent the grocery transactional data fairly, providing coherent, distinctive and highly reliable grocery themes. Using a user study, I discuss thresholds that guide interpretation of topic coherence and similarity. I propose a clustering methodology to identify topics of low uncertainty by fusing multiple posterior samples. In the second part, I reinterpret the segmented topic model (STM) to accommodate grocery store metadata and identify spatially driven customer behaviours. This novel application harnesses the store hierarchy over transactions to learn topics that are relevant within stores due to customised product assortments. Linear Gaussian process regression complements the analysis to account for spatial autocorrelation and to investigate topics' spatial prevalence across the United Kingdom. In the third part, I propose a variation of the STM, the Sequential STM (SeqSTM), to accommodate time sequence over transactions and to learn time-specific customer behaviours.
This model is inspired by the STM and the dynamic mixture model (DMM); however, the former does not naturally account for temporal sequence and the latter does not accommodate transactions' dependency on time variables. SeqSTM is suitable for learning topics where product assortment varies with respect to time, and where transactions are exchangeable within time slices. In this thesis, I identify customer behaviours that characterise British grocery retail. For instance, topics reveal natural groups of products that are used in the preparation of specific dishes; that convey diets or outdoor activities; that are characteristic of festivities, household or pet ownership; or that show a preference for brands, price or quality. I have observed that customer behaviours vary regionally due to product availability and/or preference for specific products. In this manner, each constituent country of the UK, the northern and southern regions of England, and London show a preference for different products. Finally, I show that customer behaviours may respond to seasonal product availability and/or be motivated by seasonal weather. Examples include the consumption of tropical fruits around summer and of high-calorie foods during cold months.